9  Acute Inpatient Stays

In this section we describe how to create acute inpatient encounters (i.e. hospital stays) from medical claims data. At the highest level this involves two steps:

  1. Identifying claims that occurred during an acute inpatient hospital stay
  2. Merging these claims into a single encounter

We go into significant detail about how this is done in the Tuva data model and how to use data tables in the data model to identify possible data quality problems and perform analytics.

First we need to connect to our database. We do this with the following connection.

Code
%%capture

import snowflake.connector
import pandas as pd

# Connect to Snowflake with SSO
conn = snowflake.connector.connect(
    user="aaron@tuvahealth.com",
    account="ksa27360.us-east-1",
    authenticator="externalbrowser",
    warehouse="COMPUTE_WH",
    database="medicare_lds_five_percent",
    schema="PUBLIC",
    role="accountadmin"
)

9.1 Identifying Acute Inpatient Claims

The first step in building acute inpatient encounters is identifying claims that occurred during acute inpatient hospital stays. We need to do this for both institutional and professional claims.

The following fields are commonly used to identify acute inpatient institutional claims: - Bill type code in (11X, 12X) - Any valid MS- or APR-DRG - Room and board revenue center codes

Bill type codes equal to 11X or 12X should only be found on acute inpatient claims. The same is true for any valid MS- or APR-DRGs. However, room and board revenue center codes can be found on a wide variety of inpatient claims (e.g. SNF, inpatient rehab, etc.).

Given these fields are essential for this analysis, we need to quickly assess whether they have any data quality problems. For bill type and DRG fields we need to check the following issues: - Missing: Every claim line should have a value populated - Invalid: Every value should exist in the official terminology table - Duplicate: Every claim should have one and only one value

By comparison, revenue center codes should meet the missing and invalid criteria, however, each institutional claim can and typically does have more than one value.

The data quality summary in acute inpatient can quickly reveal whether these fields have any of these problems. Let’s query this table and take a look.

Code
%%capture
# Query the acute IP data quality summary table and print a dataframe with the relevent rows.
query = "select * from input_layer.medical_claim limit 100000"
df = pd.read_sql(query, conn)
Code
import pandas as pd
import plotly.express as px

# Normalize column names to upper case
df.columns = df.columns.str.upper()

# Ensure CLAIM_START_DATE is a datetime field
df['CLAIM_START_DATE'] = pd.to_datetime(df['CLAIM_START_DATE'])

# Extract month and year for grouping
df['YEAR'] = df['CLAIM_START_DATE'].dt.year
df['MONTH'] = df['CLAIM_START_DATE'].dt.strftime('%b')  # Short month name
df['YEAR_MONTH'] = df['CLAIM_START_DATE'].dt.to_period('M')

# Group by month-year and count unique claims
monthly_claims = df.groupby(['YEAR_MONTH', 'YEAR', 'MONTH'])['CLAIM_ID'].nunique().reset_index()
monthly_claims['YEAR_MONTH'] = monthly_claims['YEAR_MONTH'].dt.to_timestamp()

# Create the trend chart with Plotly
fig = px.line(
    monthly_claims,
    x='YEAR_MONTH',
    y='CLAIM_ID',
    title='Monthly Claim Volume Trend',
    labels={'CLAIM_ID': 'Number of Unique Claims'},
    markers=True
)

# Customize the layout
fig.update_layout(
    title_font_size=20,
    xaxis_title='',
    yaxis_title='Number of Unique Claims',
    xaxis_tickangle=0,
    template='plotly_white',
)

# Update x-axis ticks to display months and years
fig.update_xaxes(
    tickformat='%b',  # Show abbreviated month names
    dtick='M1',       # Set ticks for every month
    ticklabelmode='period',  # Period-aligned ticks
    showgrid=False,
)

# Display the chart
fig.show()
Code
# Close the connection
conn.close()
Code
# this is a test